STATES(1)                 STATES (Jun 6, 1997)                 STATES(1)

NAME
     states - awk-alike text processing tool

SYNOPSIS
     states [-hV] [-D var=val] [-f file] [-o outputfile]
            [-s startstate] [-W level] [filename ...]

DESCRIPTION
     States is an awk-alike text processing tool with some state
     machine extensions.  It is designed for program source code
     highlighting and for similar tasks where state information helps
     input processing.

     At any point in time, states is in exactly one state.  Each state
     is quite similar to awk's work environment: it has regular
     expressions which are matched against the input and actions which
     are executed when a match is found.  From the action blocks,
     states can perform state transitions; it can move to another
     state from which the processing is continued.  State transitions
     are recorded so states can return to the calling state once the
     current state has finished.

     The biggest difference between states and awk, besides the state
     machine extensions, is that states is not line-oriented.  It
     matches regular expression tokens from the input and once a match
     is processed, it continues processing from the current position,
     not from the beginning of the next input line.

OPTIONS
     -D var=val, --define=var=val
          Define variable var to have string value val.  Command line
          definitions override variable definitions found in the
          config file.

     -f file, --file=file
          Read state definitions from file file.
          By default, states tries to read state definitions from the
          file states.st in the current working directory.

     -h, --help
          Print a short help message and exit.

     -o file, --output=file
          Save output to file file instead of printing it to stdout.

     -s state, --state=state
          Start execution from state state.  This definition overrides
          the start state resolved from the start block.

     -V, --version
          Print states version and exit.

     -W level, --warning=level
          Set the warning level to level.  Possible values for level
          are:

          light   light warnings (default)
          all     all warnings

STATES PROGRAM FILES
     States program files can contain one start block, startrules and
     namerules blocks to specify the initial state, state definitions
     and expressions.

     The start block is the main() of the states program.  It is
     executed on script startup for each input file and it can perform
     any initialization the script needs.  It normally also calls the
     check_startrules() and check_namerules() primitives, which
     resolve the initial state from the input file name or from the
     data found at the beginning of the input file.

     Here is a sample start block which initializes two variables and
     performs the standard start state resolution:

          start {
            a = 1;
            msg = "Hello, world!";
            check_startrules ();
            check_namerules ();
          }

     Once the start block is processed, input processing continues
     from the initial state.

     The initial state is resolved from the information found in the
     startrules and namerules blocks.  Both blocks contain regular
     expression - symbol pairs: when the regular expression matches
     the name or the beginning of the input file, the initial state is
     named by the corresponding symbol.  For example, the following
     start and name rules can distinguish C and Fortran files:

          namerules {
            /\.(c|h)$/            c;
            /\.[fF]$/             fortran;
          }

          startrules {
            /-\*- [cC] -\*-/      c;
            /-\*- fortran -\*-/   fortran;
          }

     If these rules are used with the previously shown start block,
     states first checks the beginning of the input file.  If it has
     the string -*- c -*-, the file is assumed to contain C code and
     processing is started from the state called c.  If the beginning
     of the input file has the string -*- fortran -*-, the initial
     state is fortran.  If none of the start rules match, the name of
     the input file is matched against the namerules.  If the name
     ends with the suffix .c or .h, we go to state c.  If the suffix
     is .f or .F, the initial state is fortran.

     If both the start and name rules fail to resolve the start state,
     states just copies its input to output unmodified.

     The start state can also be specified from the command line with
     option -s, --state.

     State definitions have the following syntax (name is the symbol
     by which the state is referred to):

          state name {
            expr {statements}
            ...
          }

     where expr is a regular expression, special expression or symbol,
     and statements is a list of statements.  When the expression expr
     is matched from the input, the statement block is executed.  The
     statement block can call states' primitives, user-defined
     subroutines, other states, etc.  Once the block is executed,
     input processing continues from the current input position (which
     might have been changed if the statement block called other
     states).

     The special expressions BEGIN and END can be used in place of
     expr.  Expression BEGIN matches the beginning of the state; its
     block is called when the state is entered.  Expression END
     matches the end of the state; its block is executed when states
     leaves the state.

     If expr is a symbol, its value is looked up from the global
     environment.  If the value is a regular expression, it is matched
     against the input; otherwise the rule is ignored.

     The states program file can also have top-level expressions.
     They are evaluated after the program file is parsed but before
     any input files are processed or the start block is evaluated.

PRIMITIVE FUNCTIONS
     call (symbol)
          Move to state symbol and continue input file processing from
          that state.  The function returns whatever the symbol
          state's terminating return statement returned.

     check_namerules ()
          Try to resolve the start state from the namerules rules.
          The function returns 1 if the start state was resolved or 0
          otherwise.

     check_startrules ()
          Try to resolve the start state from the startrules rules.
          The function returns 1 if the start state was resolved or 0
          otherwise.

     concat (str, ...)
          Concatenate the argument strings and return the result as a
          new string.

     float (any)
          Convert the argument to a floating point number.

     getenv (str)
          Get the value of environment variable str.  Returns an empty
          string if variable str is undefined.

     int (any)
          Convert the argument to an integer number.

     length (item, ...)
          Count the length of the argument strings or lists.

     list (any, ...)
          Create a new list which contains the items any, ...

     panic (any, ...)
          Report a non-recoverable error and exit with status 1.  The
          function never returns.

     print (any, ...)
          Convert the arguments to strings and print them to the
          output.

     range (source, start, end)
          Return a sub-range of source starting from position start
          (inclusive) to end (exclusive).  Argument source can be a
          string or a list.

     regexp (string)
          Convert string string to a new regular expression.

     regexp_syntax (char, syntax)
          Modify regular expression character syntaxes by assigning
          new syntax syntax for character char.
          Possible values for syntax are:

          'w'     character is a word constituent
          ' '     character isn't a word constituent

     regmatch (string, regexp)
          Check if string string matches regular expression regexp.
          The function returns a boolean success status and sets the
          sub-expression registers $n.

     regsub (string, regexp, subst)
          Search for regular expression regexp in string string and
          replace the matching substring with string subst.  Returns
          the resulting string.  The substitution string subst can
          contain $n references to the n:th parenthesized
          sub-expression.

     regsuball (string, regexp, subst)
          Like regsub, but replace all matches of regular expression
          regexp in string string with string subst.

     split (regexp, string)
          Split string string into a list, treating matches of regular
          expression regexp as item separators.

     sprintf (fmt, ...)
          Format the arguments according to fmt and return the result
          as a string.

     strcmp (str1, str2)
          Perform a case-sensitive comparison of strings str1 and
          str2.  The function returns a value that is:

          -1      string str1 is less than str2
          0       strings are equal
          1       string str1 is greater than str2

     string (any)
          Convert the argument to a string.

     strncmp (str1, str2, num)
          Perform a case-sensitive comparison of strings str1 and
          str2, comparing at most num characters.

     substring (str, start, end)
          Return a substring of string str starting from position
          start (inclusive) to end (exclusive).

BUILTIN VARIABLES
     $.        current input line number

     $n        the nth parenthesized regular expression sub-expression
               from the latest state regular expression or from the
               regmatch primitive

     $`        everything before the matched regular expression.  This
               is usable with the regmatch primitive; the contents of
               this variable are undefined when it is used in action
               blocks to refer to the data before the block's regular
               expression.

     $B        an alias for $`

     argv      list of input file names

     filename  name of the current input file

     program   name of the program (usually states)

     version   program version string

FILES
     /usr/freeware/share/enscript/enscript.st
               enscript's states definitions

SEE ALSO
     awk(1), enscript(1)

AUTHOR
     Markku Rossi <mtr@iki.fi> <http://www.iki.fi/~mtr/>

     GNU Enscript WWW home page: <http://www.iki.fi/~mtr/genscript/>
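
EXAMPLES
     The following is a minimal sketch, not taken from the enscript
     distribution, of how a start block, name rules, state
     definitions, and the call and return mechanism might fit together
     in one program file.  The state names c and c_comment, the <i>
     and </i> markers, and the exact escaping inside the regular
     expressions are assumptions made for illustration.  The sketch
     also assumes that input not matched by any rule of the current
     state is passed through to the output.

```
/* Resolve the start state from the input file name only. */
start {
  check_namerules ();
}

namerules {
  /\.(c|h)$/    c;
}

/* Hypothetical state for C source: wrap comments in markers. */
state c {
  /(\/\*)/ {
    print ("<i>", $1);      /* $1 is the matched comment opener */
    call (c_comment);       /* continue input processing in c_comment */
    print ("</i>");         /* reached after c_comment returns */
  }
}

/* Hypothetical state entered while inside a comment. */
state c_comment {
  /(\*\/)/ {
    print ($1);             /* emit the comment terminator */
    return;                 /* go back to the calling state */
  }
}
```

     Because states is not line-oriented, a comment that spans several
     input lines would be handled by the same two rules: matching
     resumes at the current input position after each action block
     rather than at the start of the next line.  Saved under a
     hypothetical name such as sample.st, the file could be run with
     states -f sample.st foo.c.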